Lottery Correlation Calculation

 

This year's Bull Run Run 50 miler has a new entry system. In brief and to oversimplify: applicants are each assigned an independent random number. When registration closes, if the race is oversubscribed then a random starting point is selected and a block of entrants are chosen from those whose assigned numbers lie in sequence above that starting point, wrapping around so there is no endpoint. (The actual system has all sorts of other complexities, but that's what it boils down to.)

It's a fair, reasonable way to give everyone a chance to get in while avoiding the train-wreck rush-to-register that has caused some race web sites to crash from traffic overloads. But the BRR lottery system raises a fascinating personal question for me, since I would like to do the race with a friend. We've both registered — but what are the odds that we both can run? If entries were chosen at random then our chances would be independent, uncorrelated. In that case if each of us has a chance x of getting lucky (or unlucky, if you don't like running ultramarathons!) then the joint probability is simply the product of our individual odds: x * x.

But the Bull Run Run lottery system doesn't pick entrants independently — it picks them in sequence. If my friend has a number close to mine, then our chances are positively correlated: if one of us gets in, then the other probably will too. If our numbers are far apart, then we're anticorrelated.

What's the result? Since I'm a bit mathematically obsessive I stayed up late last night thinking about it — and now if you have insomnia, I have the answer for you! (^_^)

Assume random numbers are chosen on the unit interval, between 0 and 0.999..., with equal probability. Remember that everything wraps around, so immediately after 0.999... we come back to 0. Define d as the distance between my comrade's number and mine measured the short way; wraparound means that the long-way gap between us is 1 - d. Possible values of d range from 0 to 0.5, with 0.25 the average.

Let x be the fraction of all numbers chosen. If P(x,d) is the probability that both of us are selected, then:

                0        for       x < d 
    P(x,d) =    x - d    for   d < x < 1 - d
              2*x - 1    for       x > 1 - d

So what does that really mean? Glad you asked! Imagine a hula hoop, a big ring. My friend and I are randomly assigned points around the circumference of the hoop. A snake starts to crawl around the hoop. What's the chance that the snake is simultaneously touching both of our points? There are three cases:

  • if the snake is shorter than the minimum distance between our points, it will never touch both of them;
  • if the snake is longer than the short-way gap between our points around the hoop, but shorter than the long-way gap, then there's only one way for it to touch both: when it bridges the short-way gap;
  • if the snake is longer than the long-way gap, then there are two ways for it to touch both: with the snake's head and tail between us on the short side, or with them between us on the long side.

Start out with a tiny snake and let it grow. At first the snake's length makes no difference — the chance of it touching both points is 0 when the snake is too short. Once the snake is long enough to bridge the shorter gap between the two points then the fraction of the time it touches both during its journey grows directly in proportion to the snake's length. But when the snake is long enough to bridge both gaps, then the fraction of time it touches both grows twice as fast — it has two ways to overlap us.

Here's a chart of the odds (the rightmost column is for completely uncorrelated entries, picked independently):

x \ d 0.00.10.20.30.40.5random
0.00.00.00.00.00.00.00.00
0.10.10.00.00.00.00.00.01
0.20.20.10.00.00.00.00.04
0.30.30.20.10.00.00.00.09
0.40.40.30.20.10.00.00.16
0.50.50.40.30.20.10.00.25
0.60.60.50.40.30.20.20.36
0.70.70.60.50.40.40.40.49
0.80.80.70.60.60.60.60.64
0.90.90.80.80.80.80.80.81
1.01.01.01.01.01.01.01.00

Will my friend and I both get to run Bull Run Run? Right now the gap d between us is about 0.2, and x is about 0.9, so our odds look good. But x is falling fast — entries are still pouring in. Registration closes tomorrow at 9pm and we won't know the starting point for the block of successful entries until a week later. Even after that, there's a chance that many people will drop out, opening up more slots for us. So we have our fingers crossed ...

And thinking back, I remember seeing problems like this analyzed in Martin Gardner's much-beloved Scientific American "Mathematical Games" columns, ca. 1965 ...


(cf. Bull Run Run 2007 (2007-04-15), Bull Run Run 2007 Photos (2007-04-17), ...)

^z - 2008-01-14


(correlates: Critical Eyes, GrayGreenGap, Worse Obsessions, ...)